76 research outputs found

    Interpreting a Classical Geometric Proof with Interactive Realizability.

    We show how to extract a monotonic learning algorithm from a classical proof of a geometric statement by interpreting the proof by means of interactive realizability, a realizability semantics for classical logic. The statement is about the existence of a convex angle including a finite collection of points in the real plane and it is related to the existence of a convex hull. We define real numbers as Cauchy sequences of rational numbers; therefore, equality and ordering are not decidable. While the proof looks superficially constructive, it employs classical reasoning to handle undecidable comparisons between real numbers, making the underlying algorithm non-effective. The interactive realizability interpretation transforms the non-effective linear algorithm described by the proof into an effective one that uses backtracking to learn from its mistakes. The effective algorithm exhibits a "smart" behavior, performing comparisons only up to the precision required to prove the final statement. This behavior is not explicitly planned but arises from the interactive interpretation of comparisons between Cauchy sequences.

    Challenges in predicting stabilizing variations: An exploration

    An open challenge of computational and experimental biology is understanding the impact of non-synonymous DNA variations on protein function and, subsequently, human health. The effects of these variants on protein stability can be measured as the difference in the free energy of unfolding (ΔΔG) between the mutated structure of the protein and its wild-type form. Throughout the years, bioinformaticians have developed a wide variety of tools and approaches to predict the ΔΔG. Although the performance of these tools is highly variable, overall they are less accurate in predicting the ΔΔG of stabilizing variations than that of destabilizing ones. Here, we analyze the possible reasons for this difference by focusing on the relationship between experimentally measured ΔΔG and seven protein properties on three widely used datasets (S2648, VariBench, Ssym) and a recently introduced one (S669). These properties include protein structural information, different physical properties and statistical potentials. We found that two highly used input features, i.e., hydrophobicity and the Blosum62 substitution matrix, show a performance close to random choice when trying to separate stabilizing variants from either neutral or destabilizing ones. We then speculate that, since destabilizing variations are the most abundant class in the available datasets, the overall performance of the methods is higher when including features that improve the prediction for the destabilizing variants at the expense of the stabilizing ones. These findings highlight the need to design predictive methods that can also exploit input features highly correlated with the stabilizing variants. New tools should also be tested on a dataset that is not artificially balanced, reporting the performance on all three classes (i.e., stabilizing, neutral and destabilizing variants) and not only the overall results.
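
The recommendation to report performance on all three classes, rather than a single overall figure, can be illustrated with a minimal sketch. It is not taken from the paper: the ±0.5 kcal/mol neutral band, the sign convention (positive ΔΔG = stabilizing), the function names and the toy values are all assumptions made only for illustration.

```python
from collections import Counter

# Assumed neutral band and sign convention; both are illustrative choices,
# not values stated in the abstract.
NEUTRAL_BAND = 0.5  # kcal/mol, hypothetical threshold


def ddg_class(ddg, band=NEUTRAL_BAND):
    """Map a ddG value to one of three classes (positive = stabilizing here)."""
    if ddg > band:
        return "stabilizing"
    if ddg < -band:
        return "destabilizing"
    return "neutral"


def per_class_recall(experimental, predicted, band=NEUTRAL_BAND):
    """Return the recall for each class present among the experimental values."""
    totals = Counter()
    hits = Counter()
    for exp, pred in zip(experimental, predicted):
        true_cls = ddg_class(exp, band)
        totals[true_cls] += 1
        if ddg_class(pred, band) == true_cls:
            hits[true_cls] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Made-up numbers used only to exercise the functions (not real measurements).
experimental = [-2.1, -1.4, -0.9, -1.8, 0.1, -0.2, 1.2, 0.8]
predicted = [-1.7, -1.0, -1.1, -0.6, 0.0, -0.7, 0.2, -0.3]

recalls = per_class_recall(experimental, predicted)
overall = sum(
    ddg_class(e) == ddg_class(p) for e, p in zip(experimental, predicted)
) / len(experimental)
```

In this toy set the overall agreement looks acceptable because the destabilizing class dominates, while the recall on the stabilizing class is zero; a per-class report makes that gap visible.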

    Protein Stability Perturbation Contributes to the Loss of Function in Haploinsufficient Genes

    Missense variants are among the most studied genome modifications as disease biomarkers. It has been shown that the “perturbation” of the protein stability upon a missense variant (in terms of absolute ΔΔG value, i.e., |ΔΔG|) has a significant, but not predictive, correlation with the pathogenicity of that variant. However, here we show that this correlation becomes significantly amplified in haploinsufficient genes. Moreover, the enrichment of pathogenic variants increases with increasing protein stability perturbation. These findings suggest that protein stability perturbation might be considered as a potential cofactor in diseases associated with haploinsufficient genes reporting missense variants.

    How good Neural Networks interpretation methods really are? A quantitative benchmark

    Saliency Maps (SMs) have been extensively used to interpret the decisions of deep learning models by highlighting the features deemed relevant by the model. They are used on highly nonlinear problems, where linear feature selection (FS) methods fail at highlighting relevant explanatory variables. However, the reliability of gradient-based feature attribution methods such as SMs has mostly been assessed only qualitatively (visually), and quantitative benchmarks are currently missing, partially due to the lack of a definite ground truth on image data. Concerned about the apophenic biases introduced by visual assessment of these methods, in this paper we propose a synthetic quantitative benchmark for Neural Network (NN) interpretation methods. For this purpose, we built synthetic datasets with nonlinearly separable classes and an increasing number of decoy (random) features, illustrating the challenge of FS in high-dimensional settings. We also compare these methods to conventional approaches such as mRMR or Random Forests. Our results show that our simple synthetic datasets are sufficient to challenge most of the benchmarked methods. TreeShap, mRMR and LassoNet are the best performing FS methods. We also show that, when quantifying the relevance of a few nonlinearly entangled predictive features diluted in a large number of irrelevant noisy variables, neural network-based FS and interpretation methods are still far from being reliable.

    DDGun: an untrained predictor of protein stability changes upon amino acid variants

    Estimating the functional effect of single amino acid variants in proteins is fundamental for predicting the change in the thermodynamic stability, measured as the difference in the Gibbs free energy of unfolding, between the wild-type and the variant protein (ΔΔG). Here, we present the web server of the DDGun method, which was previously developed for the ΔΔG prediction upon amino acid variants. DDGun is an untrained method based on basic features derived from evolutionary information. It is antisymmetric, as it predicts opposite ΔΔG values for direct (A → B) and reverse (B → A) single and multiple site variants. DDGun is available in two versions, one based on only sequence information and the other one based on sequence and structure information. Despite being untrained, DDGun reaches a prediction performance comparable to that of trained methods. For the web server version, we updated the protein sequence database used for the computation of the evolutionary features, and we compiled two new data sets of protein variants for a blind test of its performance. On these blind data sets of single and multiple site variants, DDGun confirms its prediction performance, reaching an average correlation coefficient between experimental and predicted ΔΔG of 0.45 and 0.49 for the sequence-based and structure-based versions, respectively. Besides being used for the prediction of ΔΔG, we suggest that DDGun should be adopted as a benchmark method to assess the predictive capabilities of newly developed methods. Releasing DDGun as a web server, stand-alone program and docker image will facilitate the necessary process of method comparison to improve ΔΔG prediction.
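
The antisymmetry property mentioned above, ΔΔG(A → B) = −ΔΔG(B → A), can be made concrete with a small check. The sketch below does not reproduce DDGun or its features: `TOY_SCALE`, `toy_ddg`, `reverse` and `is_antisymmetric` are hypothetical names, and the per-residue numbers are arbitrary. It only shows how a predictor built from a difference of per-residue scores is antisymmetric by construction, and how the property could be tested for any predictor.

```python
import math

# Arbitrary per-residue values (hypothetical); NOT the features DDGun uses.
TOY_SCALE = {"A": 2, "G": -1, "L": 4, "D": -3}


def toy_ddg(variants):
    """Toy predictor: sum of (mutant - wild-type) score differences over sites.

    Each variant is a (wild_type, position, mutant) tuple.
    """
    return sum(TOY_SCALE[mut] - TOY_SCALE[wt] for wt, _pos, mut in variants)


def reverse(variants):
    """Build the reverse variant set by swapping wild-type and mutant residues."""
    return [(mut, pos, wt) for wt, pos, mut in variants]


def is_antisymmetric(predict, variants, tol=1e-9):
    """Check that predict(direct) equals -predict(reverse) within a tolerance."""
    return math.isclose(
        predict(variants), -predict(reverse(variants)), abs_tol=tol
    )


# A made-up two-site variant used only to exercise the check.
direct = [("A", 10, "G"), ("L", 42, "D")]
direct_score = toy_ddg(direct)
reverse_score = toy_ddg(reverse(direct))
```

The same `is_antisymmetric` helper could be pointed at any callable that maps a list of variants to a predicted value, which is one simple way to compare methods on this property.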

    QueryOR: a comprehensive web platform for genetic variant analysis and prioritization

    Background: Whole genome and exome sequencing are contributing to the extraordinary progress in the study of human genetic variants. In this fast-developing field, appropriate and easily accessible tools are required to facilitate data analysis. Results: Here we describe QueryOR, a web platform suitable for searching among known candidate genes as well as for finding novel gene-disease associations. QueryOR combines several innovative features that make it comprehensive, flexible and easy to use. Instead of being designed on specific datasets, it works on a general XML schema specifying formats and criteria of each data source. Thanks to this flexibility, new criteria can be easily added for future expansion. Currently, up to 70 user-selectable criteria are available, including a wide range of gene and variant features. Moreover, rather than progressively discarding variants taking one criterion at a time, the prioritization is achieved by a global positive selection process that considers all transcript isoforms, thus producing reliable results. QueryOR is easy to use, and its intuitive interface allows users to handle different kinds of inheritance as well as features related to sharing variants in different patients. QueryOR is suitable for investigating single patients, families or cohorts. Conclusions: QueryOR is a comprehensive and flexible web platform suited to easy, user-driven variant prioritization. It is freely available for academic institutions at http://queryor.cribi.unipd.it/

    Deep learning methods to predict amyotrophic lateral sclerosis disease progression

    Amyotrophic lateral sclerosis (ALS) is a highly complex and heterogeneous neurodegenerative disease that affects motor neurons. Since life expectancy is relatively low, it is essential to promptly understand the course of the disease to better target the patient’s treatment. Predictive models for disease progression are thus of great interest. One of the most extensive and well-studied open-access data resources for ALS is the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) repository. In 2015, the DREAM-Phil Bowen ALS Prediction Prize4Life Challenge was held on PRO-ACT data, where competitors were asked to develop machine learning algorithms to predict disease progression measured through the slope of the ALSFRS score between 3 and 12 months. However, although deep learning has already been successfully applied in several studies on ALS patients, to the best of our knowledge it remains unexplored for ALSFRS slope prediction in the PRO-ACT cohort. Here, we investigate how deep learning models perform in predicting ALS progression using the PRO-ACT data. We developed three models based on different architectures that showed comparable or better performance with respect to the state-of-the-art models, thus representing a valid alternative to predict ALS disease progression.

    Usefulness of a Hepatitis B Surface Antigen-Based Model for the Prediction of Functional Cure in Patients with Chronic Hepatitis B Virus Infection Treated with Nucleos(t)ide Analogues: A Real-World Study

    In patients with chronic hepatitis B (CHB) under long-term treatment with nucleos(t)ide analogues (NAs), the loss of hepatitis B surface antigen (HBsAg) is a rare event. A growing body of evidence supports the use of quantitative HBsAg for the prediction of functional cure, although these results are mainly derived from studies performed on Asian patients with hepatitis B e antigen (HBeAg)-positive CHB. Here, we investigated the clinical role of quantitative HBsAg in a real-life cohort of CHB patients under treatment with NAs in a tertiary care center in North-West Italy. A total of 101 CHB patients (HBeAg-negative, n = 86) undergoing NA treatment were retrospectively enrolled. HBsAg was measured at baseline (T0), 6 months (T1), 12 months (T2) and at the last follow-up (FU). Median FU was 5.5 (3.2–8.3) years; at the end of FU, 11 patients had lost HBsAg (annual incidence rate = 1.8%). Baseline HBsAg levels were significantly different between patients with no HBsAg loss and those achieving a functional cure (3.46, 2.91–3.97 vs. 1.11, 0.45–1.98 Log IU/mL, p < 0.001). Similarly, the HBsAg decline (Δ) from T0 to T2 was significantly different between the two groups of patients (0.05, −0.04–0.13, vs. 0.38, 0.11–0.80 Log IU/mL, p = 0.002). By stratified cross-validation analysis, the combination of baseline HBsAg and ΔHBsAg T0–T2 showed excellent accuracy for the prediction of HBsAg loss (C statistic = 0.966). These results corroborate the usefulness of quantitative HBsAg in Caucasian CHB patients treated with antivirals for the prediction of HBsAg seroclearance.